Insights into Entity Name Evolution on Wikipedia
Abstract
Working with Web archives raises a number of issues caused by their temporal characteristics. Depending on the age of the content, additional knowledge might be needed to find and understand older texts. Facts about entities in particular are subject to change, and name changes are the most severe for information retrieval. To find entities that have changed their name over time, search engines need to be aware of this evolution. We tackle this problem by analyzing Wikipedia for entity evolutions mentioned in articles, regardless of structural elements. We gathered statistics and automatically extracted minimal excerpts covering name changes by incorporating lists dedicated to that subject. In future work, these excerpts will be used to discover patterns and detect changes in other sources. In this work, we investigate whether Wikipedia is a suitable source for extracting the required knowledge.
Similar resources
Improving Persian Named Entity Recognition Using the Ezafe Kasra
Named entity recognition is a process in which the names of people, places (cities, countries, seas, etc.), and organizations (public and private companies, international institutions, etc.), as well as dates, currencies, and percentages in a text, are identified. Named entity recognition plays an important role in many NLP tasks such as semantic role labeling, question answering, summarization, machine ...
Exploiting Multilingual Wikipedia to Improve Arabic Named Entity Resources
This paper focuses on the creation of Arabic named entity gazetteers by exploiting Wikipedia and using the Naïve Bayes classifier to classify the named entities into three main categories: person, location, and organization. The process of building the gazetteer starts with automatically creating the datasets. The training dataset is constructed using only Arabic text, whereas the...
Generating a Large-Scale Entity Linking Dictionary from Wikipedia Link Structure and Article Text
Wikipedia has been increasingly used as a knowledge base for open-domain Named Entity Linking and Disambiguation. In this task, a dictionary with entity surface forms plays an important role in finding a set of candidate entities for the mentions in text. Existing dictionaries mostly rely on the Wikipedia link structure, like anchor texts, redirect links and disambiguation links. In this paper,...
Identifying and Extracting Named Entities from Wikipedia Database Using Entity Infoboxes
This paper describes an approach to named entity classification based on Wikipedia article infoboxes. It identifies the three fundamental named entity types, namely Person, Location, and Organization. Entity classification is accomplished by matching entity attributes extracted from the relevant entity article infobox against core entity attributes built from Wikipedia Infobox Templat...
Chinese Named Entity Recognition and Disambiguation Based on Wikipedia
This paper presents a method for named entity recognition and disambiguation based on Wikipedia. First, we build a Wikipedia database using the open-source tool JWPL. Second, we extract the definition term from the first sentence of each Wikipedia page and use it as external knowledge in named entity recognition. Finally, we achieve named entity disambiguation using Wikipedia disambiguation pag...
Publication date: 2014